Prediction of low recognition rate words for isolated word recognition system
نویسندگان
چکیده
This paper describes an efficient method to predict words whose recognition rates are low. This method is based on the idea that the minimum value of the word pair recognition rates corresponds to the word recognition rate. The word pair recognition rates can be calculated by the measured distributions of phoneme log likelihood difference. The proposed method was evaluated by recognition experiments using about 3000 word pairs. The correlation coefficient between the predicted and the measured recognition rates was 0.87 when the phoneme lengths of both words of the word pair were equal. Furthermore, we also estimated a 95% confidence interval for the measured recognition rates, and the percentage of the predicted words that were contained in the confidence interval was 94.8%. The results showed the effectiveness of the proposed method for predicting the word pair recognition rates.
منابع مشابه
تشخیص دستنوشتۀ برخط فارسی با استفاده از مدل زبانی و کاهش قوانین نگارش کاربر
The Joint-up, cursive form of Persian words and immense variety of its scripts, also different figures of Persian letters depending on their sitting positions in the words, have turned the Persian handwritings recognition to an intense challenge. The major obstacle of the most often recognition ways, is their inattention to sentence contexture which causes utilizing of a word with correct appea...
متن کاملHolistic Farsi handwritten word recognition using gradient features
In this paper we address the issue of recognizing Farsi handwritten words. Two types of gradient features are extracted from a sliding vertical stripe which sweeps across a word image. These are directional and intensity gradient features. The feature vector extracted from each stripe is then coded using the Self Organizing Map (SOM). In this method each word is modeled using the discrete Hidde...
متن کاملSpoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB) as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIBchr('39')s archive is one of the richest archives in Iran containing a huge amount of multimedia data. Monitoring this massive volume of data, and brows and retrieval of this archive is one of the key issues for this broadcasting...
متن کاملWritten word recognition by the elementary and advanced level Persian-English bilinguals
According to a basic prediction made by the Revised Hierarchical Model (RHM), at early stages of language acquisition, strong L2-L1 lexical links are formed. RHM predicts that these links weaken with increasing proficiency, although they do not disappear even at higher levels of language development. To test this prediction, two groups of highly proficie...
متن کاملFuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition
In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...
متن کامل